A Nomogram for Predicting Cancer‐Specific Survival in Young Patients With Advanced Lung Cancer Based on Competing Risk Model

ABSTRACT Background Young lung cancer is a rare subgroup accounting for 5% of lung cancer. The aim of this study was to compare the causes of death (COD) among lung cancer patients of different age groups and to construct a nomogram to predict cancer‐specific survival (CSS) in young patients with advanced‐stage disease. Methods Lung cancer patients diagnosed between 2004 and 2015 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database and stratified into young (18–45 years) and old (> 45 years) groups to compare their COD. Young patients diagnosed with advanced‐stage disease (IVa and IVb) from 2010 to 2015 were reselected and divided into training and validation cohorts (7:3). Independent prognostic factors were identified through Fine‐Gray's test and then integrated into the competing risk model. The area under the receiver operating characteristic curve (AUC), concordance index (C‐index), and calibration curve were applied for validation. Results The proportion of cancer‐specific death (CSD) in young patients was higher than that in old patients with early‐stage lung cancer (p < 0.001), while there was no difference in the advanced stage (p = 0.999). Through univariate and multivariate analysis, 10 variables were identified as independent prognostic factors for CSS. The AUC of the 1‐, 3‐, and 5‐year prediction of CSS was 0.688, 0.706, and 0.791 in the training cohort and 0.747, 0.752, and 0.719 in the validation cohort. The calibration curves showed good agreement between predicted and observed probabilities. The C‐index of the competing risk model was 0.692 (95% CI: 0.636–0.747) in the young patient cohort. Conclusion Young lung cancer is a distinct entity with a different spectrum of competing risk events. Our nomogram can provide new insights into the management of young patients with lung cancer.


| Introduction
Lung cancer is more prevalent in the elderly population, with high incidence and mortality, while young patients are considered a distinct entity [1,2]. There is no unified definition of young lung cancer; most studies currently define it as lung cancer in patients older than 18 and younger than 45 years [3,4]. As a rare subgroup accounting for 5% of lung cancers, young patients are more likely to be female, non-White, and nonsmokers, and to have adenocarcinoma [5][6][7]. Furthermore, young lung cancer shows more aggressive tumor behavior, with an advanced clinical stage at diagnosis, which may be related to insidious onset, atypical symptoms, difficulties in early diagnosis, and misdiagnosis [7][8][9].
Young patients with lung cancer have better median survival and overall survival (OS) than old patients, owing to more aggressive treatment intentions and better tolerability of combination therapies [9,10]. In addition, the distinctive genetic characteristics observed in young patients suggest that they are often more amenable to targeted therapies, which is associated with favorable prognoses [11,12]. Indeed, multiple studies have indicated that young age is an independent protective prognostic factor for lung cancer [6][13][14][15]. Moreover, in elderly patients with lung cancer, the presence of multiple comorbidities often substantially increases the probability of competing risk events as survival lengthens [16][17][18]. However, competing risk events in young lung cancer have not yet been thoroughly investigated.
Nomograms, which estimate the individual probability of recurrence or death based on independent prognostic factors, have become a widely used tool in clinical practice [19,20]. Currently, a few studies have developed nomograms to predict OS and cancer-specific survival (CSS) in young patients with non-small cell lung cancer (NSCLC), but with limited consideration of competing risk events or comprehensive evaluation of independent prognostic factors [21]. Given the extremely poor prognosis, the management of advanced-stage patients is often more challenging. However, unlike in elderly lung cancer, the analysis and assessment of competing risk events in advanced-stage young patients remain deficient. To explore the impact of different demographic characteristics and treatments on the prognosis of young lung cancer patients with advanced stage, we collected cases from the Surveillance, Epidemiology, and End Results (SEER) database between 2010 and 2015. Further, we constructed and validated a nomogram for young lung cancer patients using a competing risk model, hoping to provide a useful reference for clinical decision-making.

| Data Source and Selection
The SEER database contains information about the cancer incidence and survival of approximately 34.6% of the American population. Patients diagnosed with lung cancer were extracted from the SEER database. We reclassified the TNM stage based on Collaborative Stage (CS) information according to the AJCC eighth edition. A cohort comprising young and old patients was selected to compare the causes of death between age groups. For the cause-of-death analysis, we retained as many samples with accurate causes of death as possible to allow more accurate comparisons between age groups. The inclusion criteria were as follows: (1) malignant tumor in lung and bronchus (primary sites: C34.0-34.9) diagnosed from 2004 to 2015; (2) age ≥ 18 years, with patients aged 18-45 years classified as young lung cancer; and (3) lung cancer histology codes ranging from 8012 to 8576 of ICD-O-3. Subsequently, to construct the competing risk model for young lung cancer, we reselected patients aged 18-45 and diagnosed with advanced (IVa and IVb) lung cancer from 2010 to 2015 as the target cohort. This range was selected because the SEER combined metastasis fields have been available only since 2010. Additionally, patients diagnosed after 2015 were excluded to ensure adequate follow-up time. To select the target group accurately, we kept only lung cancer patients with pathological diagnostic evidence and excluded those diagnosed only by less reliable methods, such as clinical manifestation and radiography. Other exclusion criteria were as follows: (1) unknown TNM stage, unknown race, unknown metastases, or unknown treatment modality; (2) patients not diagnosed with positive histology or exfoliative cytology; and (3) patients diagnosed with death certificates only. For patients whose survival time was less than 1 month, the SEER database records the survival time as 0 months. For accuracy, the survival time of these patients was instead recorded as 0.5 months. The patient selection procedure is provided in Figure 1.

| Comparison of the Cause of Death (COD) in Two Groups
To identify the difference in the COD between age groups, the total cohort was divided into the old group (aged > 45) and the young group (aged 18-45). All-cause mortality and cancer-specific mortality were analyzed and compared between the two groups. OS was defined as the time from diagnosis to all-cause death, while CSS was defined as the time from diagnosis to cancer-specific death (CSD). Patients alive at the time of analysis or lost to follow-up were treated as censored. In addition, we counted competing mortality risk events in each group. The endpoint events consisted of CSD, other-cause death (OCD), and censoring. CSD was the primary endpoint of interest, and OCD was the competing event for CSD.
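The endpoint definitions above, together with the 0.5-month recoding described in the data selection section, can be illustrated with a minimal Python sketch. The study itself was carried out in R; the integer codes, the `Patient` fields, and the three example records below are hypothetical and serve only to show the coding scheme.

```python
from dataclasses import dataclass

# Hypothetical integer codes for the three endpoint states.
CENSORED, CSD, OCD = 0, 1, 2


@dataclass
class Patient:
    survival_months: float      # as recorded by the registry
    alive_or_lost: bool         # alive at analysis or lost to follow-up
    died_of_lung_cancer: bool   # only meaningful when alive_or_lost is False


def recode_survival(months: float) -> float:
    """Survival under 1 month is stored as 0; recode it as 0.5 months."""
    return 0.5 if months == 0 else float(months)


def encode_endpoint(p: Patient) -> int:
    """Map a patient to censored, cancer-specific death, or other-cause death."""
    if p.alive_or_lost:
        return CENSORED
    return CSD if p.died_of_lung_cancer else OCD


# Three made-up records, not SEER data.
patients = [
    Patient(0, False, True),
    Patient(14, True, False),
    Patient(30, False, False),
]
coded = [(recode_survival(p.survival_months), encode_endpoint(p)) for p in patients]
print(coded)
```

Keeping OCD as its own code, rather than folding it into censoring, is what allows the later competing risk analysis to treat it as a competing event.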

| Predictive Variables and Construction of the Competing Risk Model
To predict the CSD and OCD of young patients with advanced lung cancer, the young patients were randomly split into a training set and a validation set at a 7:3 ratio. The training set was used to establish the competing risk model, and the validation set was used for validation. Variables involved in the research included sex, age, race, TNM stage, bone metastasis, brain metastasis, lung metastasis, surgery, and chemotherapy. The distributional differences of each variable between the two cohorts were evaluated with the chi-square test. In the univariate analysis, the probabilities of CSD and OCD in the groups defined by each variable were calculated with the cumulative incidence function (CIF). Differences between groups were then assessed using Fine-Gray's test. Subsequently, the variables with p < 0.05 in the univariate competing risk analysis were incorporated into the proportional subdistribution hazard model for multivariate analysis. Based on the model, we obtained the subdistribution hazard ratio (sdHR) and 95% confidence interval (95% CI) of each variable. Finally, the nomogram for the 1-year, 3-year, and 5-year CSS rates was constructed from the optimal regression model using the variables that were significant in the multivariate analysis (p < 0.05).
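As a rough illustration of two steps in this workflow, the following Python sketch shows a seeded 7:3 random split and a nonparametric cumulative incidence estimate of the Aalen-Johansen type. It is not the authors' code (the analysis used R packages), and the function names, the seed, and the five-patient toy data are assumptions made for the example. Event codes follow the convention 0 = censored, 1 = CSD, 2 = OCD.

```python
import random


def split_7_3(items, seed=0):
    """Shuffle with a fixed seed and split into 70% training, 30% validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = (7 * len(shuffled)) // 10
    return shuffled[:n_train], shuffled[n_train:]


def cumulative_incidence(times, events, cause):
    """Nonparametric CIF for one cause in the presence of competing events.

    Returns a list of (time, CIF) pairs at each time where any event occurs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0   # probability of no event of any cause before time t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = d_any = d_cause = 0
        while i < len(data) and data[i][0] == t:
            n_here += 1
            if data[i][1] != 0:
                d_any += 1
                if data[i][1] == cause:
                    d_cause += 1
            i += 1
        if d_any > 0:
            cif += event_free * d_cause / n_at_risk
            event_free *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve


train, valid = split_7_3(range(1585))
print(len(train), len(valid))

# Toy data only: survival in months and event codes.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 1]
cif_csd = cumulative_incidence(toy_times, toy_events, cause=1)
cif_ocd = cumulative_incidence(toy_times, toy_events, cause=2)
print(cif_csd)
print(cif_ocd)
```

Unlike one minus a Kaplan-Meier estimate that censors competing deaths, the CIFs for all causes sum to at most one, which is why this estimator is preferred when OCD competes with CSD.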

| Validation of the Competing Risk Model
Model performance was evaluated with the area under the receiver operating characteristic curve (AUC), the concordance index (C-index), and calibration curves, which together assess the discrimination and calibration of the competing risk model. The AUC was used to quantify the discrimination performance of the model, and a value greater than 0.7 indicates good discrimination [22]. The C-index measures how well the ranking of model-predicted risks agrees with the observed outcomes. It ranges from 0.50 to 1.00, and a higher C-index implies better discrimination [23]. In our study, the C-index was calculated using the original dataset (1585 samples), and the 95% CIs were calculated from 1000 bootstrap samples. The calibration curve depicts the agreement between nomogram-predicted probabilities and observed risks. A calibration curve falling on the 45° diagonal line implies high prediction accuracy [24].
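To make the C-index and its bootstrap interval concrete, here is a simplified Python sketch. It uses a competing-risk style of concordance in which a patient who died of another cause is treated as never experiencing the event of interest, and it omits the censoring weights that a full implementation would include. The function names, the percentile method for the interval, and the toy data are assumptions, not the procedure used in the study.

```python
import random


def c_index(times, events, risk, cause=1):
    """Simplified concordance for one cause (no censoring weights).

    Patient i (event of interest at times[i]) is compared with patient j when
    j is still event-free after times[i] or j already had a competing event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != cause:
            continue
        for j in range(n):
            if i == j:
                continue
            later = times[j] > times[i]
            competing_earlier = events[j] not in (0, cause) and times[j] <= times[i]
            if later or competing_earlier:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def bootstrap_ci(times, events, risk, n_boot=1000, seed=0, alpha=0.05):
    """Percentile bootstrap interval for the C-index."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = c_index([times[k] for k in idx],
                    [events[k] for k in idx],
                    [risk[k] for k in idx])
        if c == c:  # drop resamples with no comparable pairs (NaN)
            stats.append(c)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


# Toy data only: higher risk score should mean earlier cancer-specific death.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 1, 0]
perfect = c_index(toy_times, toy_events, [4, 3, 2, 1])
reversed_rank = c_index(toy_times, toy_events, [1, 2, 3, 4])
ci_low, ci_high = bootstrap_ci(toy_times, toy_events, [4, 3, 2, 1], n_boot=200)
print(perfect, reversed_rank, ci_low, ci_high)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which matches the range quoted above for a model that is at least as good as chance.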

| Statistical Analysis
R software (version 3.4.1; http://www.r-project.org) was used to perform all statistical analyses. The R packages cmprsk, rms, and mstate were utilized to develop and validate the competing risk model. All statistical tests were two-sided, and p < 0.05 was considered to be statistically significant.

| Characteristics of Patients and Comparison of the COD in Two Groups
A total of 322 207 old patients and 8147 young patients with advanced lung cancer were enrolled in our study. Compared with the old group, the prognosis of the young group was significantly better, with a median OS of 18 months in the young group and 11 months in the old group (HR = 0.66 (0.65-0.68), p < 0.001) (Figure 2). The death rate and COD of the two groups were then compared. The proportion of deaths in the old group was higher than that in the young group in both early and advanced stages of lung cancer (64.85% vs. 38.33%, p < 0.001; 94.58% vs. 88.21%, p < 0.001) (Figure 3A-D). In terms of the COD, the proportion of CSD in young patients was higher than in old patients with early-stage lung cancer (84.68% vs. 74.70%, p < 0.001), while there was no difference in the advanced stage (87.78% vs. 87.79%, p = 0.999) (Figure 3E-H). Additionally, the categories and proportions of OCD varied across age groups and stages (Figure 4). Besides lung cancer, old patients with early-stage lung cancer mainly died of chronic diseases, such as heart disease and chronic obstructive pulmonary disease. In advanced-stage lung cancer, death resulting from other malignant cancers accounted for the largest proportion of OCD (Figure 4A,B). For young patients, miscellaneous malignant cancer and diseases of the heart were the common COD in both early and advanced stages (Figure 4C,D).

| Factors Associated with CSS
A total of 1585 young patients were randomly allocated into the training cohort (n = 1109) and validation cohort (n = 476) at a ratio of 7:3. The baseline characteristics of the two cohorts are shown in Table 1. Specifically, about half of the patients in our study were female (50.5%), and most had lung adenocarcinoma (69%) and received chemotherapy (80%). Through the univariate analysis, we calculated the 3-year and 5-year cumulative incidence of CSD and OCD grouped by different variables (Table 2 and Table S1). The corresponding CIF curves are presented in Figure 5. Based on Fine-Gray's test results, all factors except primary site, laterality, brain metastasis, and lung metastasis were significantly associated with CSD (p < 0.05). Male sex and older age were associated with a higher cumulative incidence of CSD. Earlier T and N stage, grade I, absence of bone metastasis, and absence of liver metastasis were associated with a lower cumulative incidence of CSD. The CIF for CSD differed significantly between patients with and without surgery and between those with and without chemotherapy. However, few variables were significant for the OCD event in the univariate analysis (Table S1). Through multivariate analysis, sex, age group, race, grade, histological types, T stage, N stage, bone metastasis, surgery, and chemotherapy were considered independent prognostic factors for CSS (p < 0.05) (Table 3).

| Nomogram Construction and Validation Based on Competing Risk Model
Based on the independent prognostic factors, a nomogram was constructed to predict the probability of CSS for young patients with advanced lung cancer (Figure 6). Grade made the largest contribution to the prognosis of young patients, followed by surgery, chemotherapy, and histological types. The total score was the sum of the scores of each factor, and the 1-, 3-, and 5-year survival rates were read by drawing a vertical line from the total score on the nomogram. The AUCs of the competing risk nomogram for the 1-, 3-, and 5-year prediction of CSD were 0.688, 0.706, and 0.791 in the training cohort and 0.747, 0.752, and 0.719 in the validation cohort, respectively (Figure 7). The predicted calibration curves were close to the standard curves for 1-, 3-, and 5-year survival in both the training set and validation set, which revealed high consistency of the nomogram (Figure 8). Through bootstrap cross-validation, the C-index was 0.692 (95% CI: 0.636-0.747), indicating that the predictive model had reasonable discrimination.
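The way a nomogram turns a sum of points into a predicted probability can be sketched as follows. Under a proportional subdistribution hazard model, the cumulative incidence for a patient with covariates x is 1 - (1 - F0(t))^exp(beta·x), where F0(t) is the baseline cumulative incidence. The coefficients, the baseline value, and the variable names in this Python sketch are hypothetical placeholders, not the estimates reported in Table 3, and the point scaling assumes binary indicators with positive coefficients.

```python
import math

# HYPOTHETICAL log subdistribution hazard ratios, for illustration only.
# These are NOT the values estimated in this study.
COEFS = {
    "no_surgery": 0.60,
    "no_chemotherapy": 0.55,
    "bone_metastasis": 0.20,
}


def linear_predictor(x):
    """Sum of coefficient times indicator over the model variables."""
    return sum(COEFS[k] * x.get(k, 0) for k in COEFS)


def nomogram_points(x):
    """Scale each contribution so the strongest variable is worth 100 points."""
    scale = 100.0 / max(abs(b) for b in COEFS.values())
    return {k: abs(COEFS[k]) * x.get(k, 0) * scale for k in COEFS}


def predicted_cif(baseline_cif_t, x):
    """Cumulative incidence of CSD at a fixed time for covariates x."""
    return 1.0 - (1.0 - baseline_cif_t) ** math.exp(linear_predictor(x))


patient = {"no_surgery": 1, "bone_metastasis": 1}
points = nomogram_points(patient)
total_points = sum(points.values())

# Hypothetical baseline cumulative incidence of CSD at a chosen time point.
baseline = 0.5
risk = predicted_cif(baseline, patient)
print(points, total_points, risk)
```

Because the total score is a rescaled linear predictor, reading a probability off the nomogram axis is equivalent to evaluating this formula; the probability of remaining free of cancer-specific death at that time is approximately one minus the predicted cumulative incidence.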

| Discussion
Our retrospective study indicates that young lung cancer is a distinct entity from lung cancer in elderly patients, with significantly different competing risk events. Furthermore, to the best of our knowledge, this is the first study to establish a competing risk nomogram model for advanced young lung cancer.
The current results regarding the prognosis of young lung cancer patients remain controversial. Thomas et al. found that both OS and lung CSS (LCSS) of young patients were superior to those of elderly patients [7]. However, some research reported no significant difference in prognosis between metastatic young patients and older patients [11]. [25]. The difference narrows significantly as cancer stage advances [26], which may be because the disease itself plays a dominant role in determining patient prognosis in advanced stages of lung cancer, whereas in the early stages of the disease, the prognosis of elderly patients is affected by both the disease itself and age-related comorbidities [11].
For lung cancer patients, the impact of comorbidity on prognosis gradually increases as survival time lengthens [27]. Therefore, analyzing the COD in young patients with advanced lung cancer may facilitate the development of oncologic treatments and care strategies, which can ultimately improve prognostic outcomes. In our study, we found that the major competing risk events of advanced young lung cancer were mostly consistent with those of old patients, including other malignancies and heart-related diseases. Nevertheless, compared to older patients, young patients with advanced lung cancer have a lower proportion of deaths attributed to comorbidities such as cardiovascular disease, COPD, and cerebrovascular disease, which are commonly associated with smoking in the elderly population [28]. It is worth noting that young lung cancer patients have a higher proportion of nontumor deaths from cardiovascular disease, which may be related to the toxicity of cancer treatment, including radiotherapy and chemotherapy [29]. Additionally, we noticed a higher proportion of accidents and adverse effects in young patients with lung cancer. These findings underscore the need for more attention to adverse event management in the treatment of young patients with lung cancer, as well as the importance of addressing their psychological well-being.
Previous studies on the prognosis of advanced-stage young lung cancer have mainly relied on OS, with little consideration for the impact of competing events. Non-lung cancer-specific mortality may interfere with the prediction of lung cancer-specific mortality when there are competing events for the outcome [30,31]. Therefore, it is more appropriate to apply the competing risk model to construct the prognostic model in the presence of competing events. Peng and Sun established nomograms for OS and CSS in young NSCLC, both of which outperformed the TNM staging system in terms of predictive performance [21]. Although they established a nomogram based on the competing risk model, their study was restricted to NSCLC and failed to take into account risk factors such as metastasis. Considering the urgent need for clinical prognostic guidance in late-stage young adult lung cancer, we have developed and validated a nomogram based on a large population from the SEER database.
In the present study, 10 independent risk factors associated with CSS, namely sex, age group, race, grade, histological types, T stage, N stage, bone metastasis, surgery, and chemotherapy, were identified through competing risk analysis for young patients with advanced lung cancer. Consistent with previous studies [25,32], our research found that advanced-stage young lung cancer patients were predominantly female and had adenocarcinoma, both of which are protective prognostic factors for lung cancer patients. These patients also tend to have more concomitant mutated genes [3]. Multiple studies have shown that the genetic features of young-onset lung cancer patients are distinct from those of older patients, which allows young-onset lung cancer patients to benefit from targeted therapies directed at these genes [33].
After integrating all independent prognostic factors, we developed a nomogram based on a competing risk model to predict cancer-specific survival. Our model demonstrated good discriminative and predictive accuracy, as evidenced by the C-index and AUC values. The calibration curves showed excellent consistency between the predicted and observed probabilities of 1-year, 3-year, and 5-year CSS in both the training and validation sets. In clinical applications, our model is user-friendly, as all parameters are readily accessible.
Nonetheless, the parameters in our model merit further investigation. In contrast to Peng's study, we analyzed multiple types of metastases and subsequently incorporated bone metastasis into our model. Previous studies have shown that young NSCLC patients are at a higher risk of bone metastasis, which is probably attributable to higher bone marrow flow and circulating tumor cells in the skeletal system of young patients [34,35]. All of these indicate that bone metastasis is a crucial indicator when evaluating the prognosis of young adult lung cancer. Furthermore, in our model, there is a higher proportion of pathological staging, surgery, and chemotherapy. In particular, the risk scores of no surgery and no chemotherapy were both higher than 60 points, indicating a significant impact on prognosis and an important role in guiding clinical practice. The conventional perspective holds that the treatment of metastatic lung cancer is mainly palliative rather than surgical [36]. Recent studies have indicated that surgery may be appropriate for particular patients with advanced NSCLC who have limited metastases and can bring significant survival benefits [37,38]. Additionally, studies have shown that advanced patients with younger age, smaller primary lung tumors, N0 stage, and lobectomy derive greater surgical benefit [37]. As mentioned above, young patients have fewer comorbidities and better treatment tolerance. Combined with our model, we believe that young patients with limited metastasis in advanced lung cancer may benefit from surgery.
We acknowledge certain limitations in our study. First, this study was limited by the data collection of a retrospective study, leading to unavoidable bias. Additionally, although we found some potential benefits from active therapy for young patients with advanced lung cancer, the SEER database does not include detailed treatment information, which limits the clinical application of this finding. Future prospective studies will need to collect the specific surgery types and chemotherapeutic agents to identify the subgroup of young lung cancer patients who benefit from active therapy. Second, previous studies have shown that the genetic characteristics that distinguish young lung cancer patients from elderly patients are related to their better treatment prognosis [33]. Since genetic data are not available in the SEER database, our study lacks further exploration of the relationship between genes and prognosis. In future research, studying the genetic characteristics of the young lung cancer patients who benefit from active treatment may help guide clinical treatment decisions.

| Conclusion
In this study, we found that young lung cancer is a distinct entity from lung cancer in elderly patients, with a significantly different prognosis and spectrum of competing risk events. Our study also proposed a useful tool to predict the CSS of young patients with advanced lung cancer. Our findings may facilitate personalized clinical decisions: they suggest that clinicians should be alert to the cardiac adverse effects of treatment and indicate potential benefits from surgery for a subgroup of young patients with advanced lung cancer.

FIGURE 1 | Flow diagram illustrating the screening process in the SEER database.

FIGURE 2 | Cumulative Kaplan-Meier estimates of rates of all-cause mortality for young lung cancer patients versus old lung cancer patients.

FIGURE 3 | The causes of death analysis of young lung cancer patients and old lung cancer patients. (A, B) The proportion of the dead and surviving patients in the old group with early-stage and advanced lung cancer. (C, D) The proportion of the dead and surviving patients in the young group with early-stage and advanced lung cancer. (E, F) Cancer-specific death and other-cause death rate in the old group with early-stage and advanced lung cancer. (G, H) Cancer-specific death and other-cause death rate in the young group with early-stage and advanced lung cancer.

FIGURE 5 | Cumulative incidence curves of cancer-specific mortality and other-cause death. Solid line: lung cancer-specific death; dotted line: other-cause death.

FIGURE 7 | ROC curve for predicting the cancer-specific survival in young patients with advanced lung cancer. (A) ROC curves of 1-, 3-, and 5-year CSS rates in the training set. (B) ROC curves of 1-, 3-, and 5-year CSS rates in the validation set.

TABLE 1 | Clinical characteristics of young patients with advanced lung cancer.


TABLE 2 | Univariate competing risk analysis for lung cancer-specific mortality.

TABLE 3 | Proportional subdistribution hazard model for lung cancer-specific mortality.
FIGURE 6 | Nomogram for predicting cancer-specific survival in young patients with advanced lung cancer.